Dataset statistics
| Number of variables | 9 |
|---|---|
| Number of observations | 8991 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 632.3 KiB |
| Average record size in memory | 72.0 B |
Variable types
| Numeric | 9 |
|---|---|
CO(GT) is highly correlated with PT08.S1(CO) and 7 other fields | High correlation |
PT08.S1(CO) is highly correlated with CO(GT) and 7 other fields | High correlation |
C6H6(GT) is highly correlated with CO(GT) and 7 other fields | High correlation |
PT08.S2(NMHC) is highly correlated with CO(GT) and 7 other fields | High correlation |
NOx(GT) is highly correlated with CO(GT) and 6 other fields | High correlation |
PT08.S3(NOx) is highly correlated with CO(GT) and 7 other fields | High correlation |
NO2(GT) is highly correlated with CO(GT) and 6 other fields | High correlation |
PT08.S4(NO2) is highly correlated with CO(GT) and 5 other fields | High correlation |
PT08.S5(O3) is highly correlated with CO(GT) and 7 other fields | High correlation |
CO(GT) is highly correlated with PT08.S1(CO) and 6 other fields | High correlation |
PT08.S1(CO) is highly correlated with CO(GT) and 4 other fields | High correlation |
C6H6(GT) is highly correlated with CO(GT) and 5 other fields | High correlation |
PT08.S2(NMHC) is highly correlated with CO(GT) and 5 other fields | High correlation |
NOx(GT) is highly correlated with CO(GT) and 3 other fields | High correlation |
PT08.S3(NOx) is highly correlated with CO(GT) and 5 other fields | High correlation |
NO2(GT) is highly correlated with CO(GT) and 1 other field | High correlation |
PT08.S4(NO2) is highly correlated with C6H6(GT) and 1 other field | High correlation |
PT08.S5(O3) is highly correlated with CO(GT) and 5 other fields | High correlation |
C6H6(GT) is highly correlated with PT08.S4(NO2) and 7 other fields | High correlation |
PT08.S4(NO2) is highly correlated with C6H6(GT) and 5 other fields | High correlation |
CO(GT) is highly correlated with C6H6(GT) and 7 other fields | High correlation |
NOx(GT) is highly correlated with C6H6(GT) and 6 other fields | High correlation |
PT08.S2(NMHC) is highly correlated with C6H6(GT) and 7 other fields | High correlation |
PT08.S1(CO) is highly correlated with C6H6(GT) and 7 other fields | High correlation |
PT08.S5(O3) is highly correlated with C6H6(GT) and 7 other fields | High correlation |
NO2(GT) is highly correlated with C6H6(GT) and 6 other fields | High correlation |
PT08.S3(NOx) is highly correlated with C6H6(GT) and 7 other fields | High correlation |
Reproduction
| Analysis started | 2021-07-01 14:50:42.201667 |
|---|---|
| Analysis finished | 2021-07-01 14:50:58.141549 |
| Duration | 15.94 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
CO(GT)
| Distinct | 94 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.069313758 |
| Minimum | 0.1 |
|---|---|
| Maximum | 11.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 0.5 |
| Q1 | 1.2 |
| median | 1.8 |
| Q3 | 2.6 |
| 95-th percentile | 4.6 |
| Maximum | 11.9 |
| Range | 11.8 |
| Interquartile range (IQR) | 1.4 |
Descriptive statistics
| Standard deviation | 1.304487151 |
|---|---|
| Coefficient of variation (CV) | 0.6303960167 |
| Kurtosis | 4.082047789 |
| Mean | 2.069313758 |
| Median Absolute Deviation (MAD) | 0.6 |
| Skewness | 1.616442422 |
| Sum | 18605.2 |
| Variance | 1.701686726 |
| Monotonicity | Not monotonic |
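Several of the figures above are derived from one another, so they can be cross-checked by hand. The following is a minimal sketch that recomputes the range, IQR, coefficient of variation, variance and mean from the other values printed for this variable. The variable names and the tolerance are illustrative choices, and the observation count is taken from the dataset statistics on the assumption that no values are missing.

```python
import math

# Figures copied from the tables above (first variable in the report).
n_observations = 8991          # from "Dataset statistics"; no missing cells
minimum, maximum = 0.1, 11.9
q1, q3 = 1.2, 2.6
mean, std, total = 2.069313758, 1.304487151, 18605.2

# Each entry pairs a value derived here with the value the report prints.
checks = {
    "Range = Maximum - Minimum": (maximum - minimum, 11.8),
    "IQR = Q3 - Q1": (q3 - q1, 1.4),
    "CV = Standard deviation / Mean": (std / mean, 0.6303960167),
    "Variance = Standard deviation ** 2": (std ** 2, 1.701686726),
    "Mean = Sum / observations": (total / n_observations, mean),
}

for label, (derived, reported) in checks.items():
    # A relative tolerance absorbs the rounding of the printed figures.
    agrees = math.isclose(derived, reported, rel_tol=1e-6)
    print(f"{label}: derived={derived:.10g} reported={reported:.10g} agrees={agrees}")
```

The same relationships should hold for the other variables in the report if their figures are substituted.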
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 1.8 | 1822 | 20.3% |
| 1 | 287 | 3.2% |
| 1.4 | 269 | 3.0% |
| 1.5 | 265 | 2.9% |
| 1.6 | 264 | 2.9% |
| 0.7 | 252 | 2.8% |
| 1.1 | 251 | 2.8% |
| 1.3 | 248 | 2.8% |
| 0.8 | 243 | 2.7% |
| 0.9 | 241 | 2.7% |
| Other values (84) | 4849 |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1 | 33 | 0.4% |
| 0.2 | 45 | 0.5% |
| 0.3 | 97 | 1.1% |
| 0.4 | 160 | |
| 0.5 | 217 | |
| 0.6 | 238 | |
| 0.7 | 252 | |
| 0.8 | 243 | |
| 0.9 | 241 | |
| 1 | 287 |
| Value | Count | Frequency (%) |
|---|---|---|
| 11.9 | 1 | < 0.1% |
| 11.5 | 1 | < 0.1% |
| 10.2 | 2 | |
| 10.1 | 1 | < 0.1% |
| 9.9 | 1 | < 0.1% |
| 9.5 | 1 | < 0.1% |
| 9.4 | 1 | < 0.1% |
| 9.2 | 1 | < 0.1% |
| 9.1 | 1 | < 0.1% |
| 8.7 | 3 |
PT08.S1(CO)
| Distinct | 1041 |
|---|---|
| Distinct (%) | 11.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1099.833166 |
| Minimum | 647 |
|---|---|
| Maximum | 2040 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 647 |
|---|---|
| 5-th percentile | 810.5 |
| Q1 | 937 |
| median | 1063 |
| Q3 | 1231 |
| 95-th percentile | 1508 |
| Maximum | 2040 |
| Range | 1393 |
| Interquartile range (IQR) | 294 |
Descriptive statistics
| Standard deviation | 217.0800373 |
|---|---|
| Coefficient of variation (CV) | 0.1973754237 |
| Kurtosis | 0.3351286502 |
| Mean | 1099.833166 |
| Median Absolute Deviation (MAD) | 142 |
| Skewness | 0.7559073724 |
| Sum | 9888600 |
| Variance | 47123.74258 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 973 | 30 | 0.3% |
| 1100 | 28 | 0.3% |
| 925 | 26 | 0.3% |
| 969 | 26 | 0.3% |
| 938 | 26 | 0.3% |
| 988 | 26 | 0.3% |
| 970 | 25 | 0.3% |
| 1053 | 25 | 0.3% |
| 966 | 25 | 0.3% |
| 987 | 25 | 0.3% |
| Other values (1031) | 8729 |
| Value | Count | Frequency (%) |
|---|---|---|
| 647 | 1 | < 0.1% |
| 649 | 1 | < 0.1% |
| 655 | 1 | < 0.1% |
| 667 | 3 | |
| 669 | 1 | < 0.1% |
| 676 | 1 | < 0.1% |
| 678 | 1 | < 0.1% |
| 679 | 1 | < 0.1% |
| 681 | 1 | < 0.1% |
| 683 | 2 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2040 | 1 | |
| 2008 | 1 | |
| 1982 | 1 | |
| 1975 | 1 | |
| 1973 | 1 | |
| 1961 | 1 | |
| 1956 | 1 | |
| 1934 | 1 | |
| 1918 | 1 | |
| 1917 | 1 |
C6H6(GT)
| Distinct | 407 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.08310533 |
| Minimum | 0.1 |
|---|---|
| Maximum | 63.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 1.7 |
| Q1 | 4.4 |
| median | 8.2 |
| Q3 | 14 |
| 95-th percentile | 24.65 |
| Maximum | 63.7 |
| Range | 63.6 |
| Interquartile range (IQR) | 9.6 |
Descriptive statistics
| Standard deviation | 7.449819698 |
|---|---|
| Coefficient of variation (CV) | 0.7388418008 |
| Kurtosis | 2.488705886 |
| Mean | 10.08310533 |
| Median Absolute Deviation (MAD) | 4.4 |
| Skewness | 1.36153227 |
| Sum | 90657.2 |
| Variance | 55.49981354 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.6 | 84 | 0.9% |
| 2.8 | 82 | 0.9% |
| 3.8 | 79 | 0.9% |
| 4 | 78 | 0.9% |
| 3.1 | 77 | 0.9% |
| 3 | 76 | 0.8% |
| 2.5 | 75 | 0.8% |
| 2.9 | 73 | 0.8% |
| 5.4 | 72 | 0.8% |
| 6 | 71 | 0.8% |
| Other values (397) | 8224 |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1 | 2 | < 0.1% |
| 0.2 | 8 | 0.1% |
| 0.3 | 10 | 0.1% |
| 0.4 | 14 | |
| 0.5 | 20 | |
| 0.6 | 23 | |
| 0.7 | 31 | |
| 0.8 | 25 | |
| 0.9 | 25 | |
| 1 | 30 |
| Value | Count | Frequency (%) |
|---|---|---|
| 63.7 | 1 | |
| 52.1 | 1 | |
| 50.8 | 1 | |
| 50.7 | 1 | |
| 50.6 | 1 | |
| 49.5 | 1 | |
| 49.4 | 1 | |
| 48.2 | 1 | |
| 47.7 | 1 | |
| 47.5 | 1 |
PT08.S2(NMHC)
| Distinct | 1245 |
|---|---|
| Distinct (%) | 13.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 939.1533756 |
| Minimum | 383 |
|---|---|
| Maximum | 2214 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 383 |
|---|---|
| 5-th percentile | 562 |
| Q1 | 734.5 |
| median | 909 |
| Q3 | 1116 |
| 95-th percentile | 1420 |
| Maximum | 2214 |
| Range | 1831 |
| Interquartile range (IQR) | 381.5 |
Descriptive statistics
| Standard deviation | 266.8314286 |
|---|---|
| Coefficient of variation (CV) | 0.2841191179 |
| Kurtosis | 0.06324387318 |
| Mean | 939.1533756 |
| Median Absolute Deviation (MAD) | 188 |
| Skewness | 0.56156598 |
| Sum | 8443928 |
| Variance | 71199.01129 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 853 | 25 | 0.3% |
| 859 | 23 | 0.3% |
| 880 | 23 | 0.3% |
| 800 | 23 | 0.3% |
| 985 | 22 | 0.2% |
| 769 | 21 | 0.2% |
| 850 | 21 | 0.2% |
| 776 | 21 | 0.2% |
| 783 | 21 | 0.2% |
| 1012 | 20 | 0.2% |
| Other values (1235) | 8771 |
| Value | Count | Frequency (%) |
|---|---|---|
| 383 | 2 | |
| 387 | 1 | |
| 388 | 1 | |
| 390 | 2 | |
| 397 | 1 | |
| 399 | 1 | |
| 402 | 2 | |
| 407 | 2 | |
| 408 | 1 | |
| 409 | 1 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2214 | 1 | |
| 2007 | 1 | |
| 1983 | 1 | |
| 1981 | 1 | |
| 1980 | 1 | |
| 1959 | 1 | |
| 1958 | 1 | |
| 1935 | 1 | |
| 1924 | 1 | |
| 1920 | 1 |
NOx(GT)
| Distinct | 898 |
|---|---|
| Distinct (%) | 10.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 230.8021355 |
| Minimum | 2 |
|---|---|
| Maximum | 1479 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 41 |
| Q1 | 112 |
| median | 178 |
| Q3 | 280 |
| 95-th percentile | 635 |
| Maximum | 1479 |
| Range | 1477 |
| Interquartile range (IQR) | 168 |
Descriptive statistics
| Standard deviation | 188.7172102 |
|---|---|
| Coefficient of variation (CV) | 0.8176579901 |
| Kurtosis | 4.952450426 |
| Mean | 230.8021355 |
| Median Absolute Deviation (MAD) | 76 |
| Skewness | 1.995177294 |
| Sum | 2075142 |
| Variance | 35614.18543 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 178 | 1617 | 18.0% |
| 89 | 39 | 0.4% |
| 93 | 36 | 0.4% |
| 65 | 35 | 0.4% |
| 180 | 35 | 0.4% |
| 132 | 35 | 0.4% |
| 122 | 35 | 0.4% |
| 95 | 34 | 0.4% |
| 41 | 34 | 0.4% |
| 51 | 33 | 0.4% |
| Other values (888) | 7058 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 3 | |
| 11 | 4 | |
| 12 | 4 | |
| 13 | 4 |
| Value | Count | Frequency (%) |
|---|---|---|
| 1479 | 1 | |
| 1389 | 2 | |
| 1369 | 1 | |
| 1358 | 1 | |
| 1345 | 1 | |
| 1301 | 1 | |
| 1290 | 1 | |
| 1247 | 1 | |
| 1230 | 1 | |
| 1220 | 1 |
PT08.S3(NOx)
| Distinct | 1221 |
|---|---|
| Distinct (%) | 13.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 835.4936047 |
| Minimum | 322 |
|---|---|
| Maximum | 2683 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 322 |
|---|---|
| 5-th percentile | 483 |
| Q1 | 658 |
| median | 806 |
| Q3 | 969.5 |
| 95-th percentile | 1291 |
| Maximum | 2683 |
| Range | 2361 |
| Interquartile range (IQR) | 311.5 |
Descriptive statistics
| Standard deviation | 256.81732 |
|---|---|
| Coefficient of variation (CV) | 0.3073839447 |
| Kurtosis | 2.677558895 |
| Mean | 835.4936047 |
| Median Absolute Deviation (MAD) | 155 |
| Skewness | 1.101729235 |
| Sum | 7511923 |
| Variance | 65955.13586 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 846 | 25 | 0.3% |
| 767 | 25 | 0.3% |
| 733 | 25 | 0.3% |
| 876 | 23 | 0.3% |
| 765 | 23 | 0.3% |
| 685 | 22 | 0.2% |
| 830 | 22 | 0.2% |
| 872 | 22 | 0.2% |
| 816 | 22 | 0.2% |
| 720 | 22 | 0.2% |
| Other values (1211) | 8760 |
| Value | Count | Frequency (%) |
|---|---|---|
| 322 | 1 | |
| 325 | 2 | |
| 328 | 1 | |
| 330 | 2 | |
| 334 | 1 | |
| 335 | 1 | |
| 340 | 2 | |
| 341 | 1 | |
| 345 | 1 | |
| 346 | 1 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2683 | 1 | |
| 2559 | 1 | |
| 2542 | 1 | |
| 2331 | 1 | |
| 2327 | 1 | |
| 2318 | 1 | |
| 2294 | 1 | |
| 2121 | 1 | |
| 2095 | 2 | |
| 2081 | 1 |
NO2(GT)
| Distinct | 274 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 111.5861417 |
| Minimum | 2 |
|---|---|
| Maximum | 333 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 45 |
| Q1 | 85 |
| median | 109 |
| Q3 | 132 |
| 95-th percentile | 193 |
| Maximum | 333 |
| Range | 331 |
| Interquartile range (IQR) | 47 |
Descriptive statistics
| Standard deviation | 43.2058078 |
|---|---|
| Coefficient of variation (CV) | 0.3871968969 |
| Kurtosis | 1.07474443 |
| Mean | 111.5861417 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | 0.6708264814 |
| Sum | 1003271 |
| Variance | 1866.741828 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 109 | 1659 | 18.5% |
| 97 | 76 | 0.8% |
| 119 | 74 | 0.8% |
| 114 | 74 | 0.8% |
| 101 | 74 | 0.8% |
| 110 | 73 | 0.8% |
| 117 | 73 | 0.8% |
| 95 | 73 | 0.8% |
| 115 | 70 | 0.8% |
| 116 | 69 | 0.8% |
| Other values (264) | 6676 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 5 | 2 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 2 | < 0.1% |
| 9 | 2 | < 0.1% |
| 11 | 2 | < 0.1% |
| 12 | 2 | < 0.1% |
| 13 | 1 | < 0.1% |
| 14 | 5 |
| Value | Count | Frequency (%) |
|---|---|---|
| 333 | 1 | < 0.1% |
| 322 | 1 | < 0.1% |
| 310 | 1 | < 0.1% |
| 309 | 1 | < 0.1% |
| 306 | 1 | < 0.1% |
| 301 | 1 | < 0.1% |
| 295 | 1 | < 0.1% |
| 288 | 2 | |
| 285 | 1 | < 0.1% |
| 283 | 3 |
PT08.S4(NO2)
| Distinct | 1603 |
|---|---|
| Distinct (%) | 17.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1456.264598 |
| Minimum | 551 |
|---|---|
| Maximum | 2775 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 551 |
|---|---|
| 5-th percentile | 883 |
| Q1 | 1227 |
| median | 1463 |
| Q3 | 1674 |
| 95-th percentile | 2029 |
| Maximum | 2775 |
| Range | 2224 |
| Interquartile range (IQR) | 447 |
Descriptive statistics
| Standard deviation | 346.2067935 |
|---|---|
| Coefficient of variation (CV) | 0.2377361875 |
| Kurtosis | 0.07801862433 |
| Mean | 1456.264598 |
| Median Absolute Deviation (MAD) | 221 |
| Skewness | 0.2053885254 |
| Sum | 13093275 |
| Variance | 119859.1439 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 1488 | 24 | 0.3% |
| 1580 | 22 | 0.2% |
| 1539 | 21 | 0.2% |
| 1467 | 20 | 0.2% |
| 1638 | 19 | 0.2% |
| 1490 | 18 | 0.2% |
| 1418 | 18 | 0.2% |
| 1321 | 17 | 0.2% |
| 1511 | 17 | 0.2% |
| 1435 | 17 | 0.2% |
| Other values (1593) | 8798 |
| Value | Count | Frequency (%) |
|---|---|---|
| 551 | 1 | |
| 559 | 1 | |
| 561 | 1 | |
| 579 | 1 | |
| 601 | 1 | |
| 602 | 1 | |
| 605 | 1 | |
| 621 | 1 | |
| 637 | 1 | |
| 640 | 1 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2775 | 1 | |
| 2746 | 1 | |
| 2691 | 1 | |
| 2684 | 1 | |
| 2679 | 1 | |
| 2667 | 1 | |
| 2665 | 1 | |
| 2662 | 1 | |
| 2643 | 2 | |
| 2641 | 2 |
PT08.S5(O3)
| Distinct | 1743 |
|---|---|
| Distinct (%) | 19.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1022.906128 |
| Minimum | 221 |
|---|---|
| Maximum | 2523 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 70.4 KiB |
Quantile statistics
| Minimum | 221 |
|---|---|
| 5-th percentile | 461 |
| Q1 | 731.5 |
| median | 963 |
| Q3 | 1273.5 |
| 95-th percentile | 1761.5 |
| Maximum | 2523 |
| Range | 2302 |
| Interquartile range (IQR) | 542 |
Descriptive statistics
| Standard deviation | 398.4842877 |
|---|---|
| Coefficient of variation (CV) | 0.3895609545 |
| Kurtosis | 0.07861233923 |
| Mean | 1022.906128 |
| Median Absolute Deviation (MAD) | 261 |
| Skewness | 0.6278644976 |
| Sum | 9196949 |
| Variance | 158789.7276 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 836 | 20 | 0.2% |
| 825 | 20 | 0.2% |
| 826 | 19 | 0.2% |
| 926 | 18 | 0.2% |
| 799 | 17 | 0.2% |
| 777 | 17 | 0.2% |
| 891 | 16 | 0.2% |
| 905 | 16 | 0.2% |
| 949 | 16 | 0.2% |
| 923 | 16 | 0.2% |
| Other values (1733) | 8816 |
| Value | Count | Frequency (%) |
|---|---|---|
| 221 | 1 | |
| 225 | 1 | |
| 227 | 1 | |
| 232 | 1 | |
| 252 | 1 | |
| 253 | 1 | |
| 257 | 1 | |
| 261 | 2 | |
| 262 | 1 | |
| 263 | 1 |
| Value | Count | Frequency (%) |
|---|---|---|
| 2523 | 1 | |
| 2522 | 1 | |
| 2519 | 1 | |
| 2515 | 1 | |
| 2494 | 1 | |
| 2480 | 1 | |
| 2475 | 1 | |
| 2465 | 1 | |
| 2452 | 1 | |
| 2434 | 1 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r; only its direction does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
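The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs. The function name `pearson_r` is hypothetical, and the two input lists are taken from the first ten sample rows of this report, assuming they are the CO(GT) and C6H6(GT) columns.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Assumed to be CO(GT) and C6H6(GT) from the first ten sample rows.
co = [2.6, 2.0, 2.2, 2.2, 1.6, 1.2, 1.2, 1.0, 0.9, 0.6]
c6h6 = [11.9, 9.4, 9.0, 9.2, 6.5, 4.7, 3.6, 3.3, 2.3, 1.7]

r = pearson_r(co, c6h6)

# Shifting and rescaling one variable (positive scale) should leave r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in co], c6h6)

print(f"r = {r:.4f}, after rescaling = {r_rescaled:.4f}")
```

Ten rows are far too few to reproduce the correlation over all 8991 observations; the sample only exercises the formula and the invariance property.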
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
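As a rough illustration of the rank-based definition above, the sketch below converts each variable to ranks and then applies the covariance-over-standard-deviations formula to those ranks. The helper names are hypothetical, and averaging the ranks of tied values is a common convention assumed here rather than something the text specifies.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # the 1/n factors cancel

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"rho = {rho:.4f}, r = {r:.4f}")
```

For this cubic relationship ρ reaches its maximum while r stays below it, which is the sense in which ρ is better at catching nonlinear monotonic correlations.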
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
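The pair-counting definition above can be written out directly. The sketch below implements exactly the formula as stated, which is usually called τ-a; the function name is hypothetical. Statistical libraries often apply a tie-corrected variant instead, so values from this sketch may differ from the report's when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both variables move in the same direction
        elif product < 0:
            discordant += 1      # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair relative to x

tau = kendall_tau_a(x, y)
print(f"tau = {tau:.4f}")
```

Here five of the six pairs are concordant and one is discordant, so τ comes out at (5 - 1) / 6.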
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation for φk is available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | CO(GT) | PT08.S1(CO) | C6H6(GT) | PT08.S2(NMHC) | NOx(GT) | PT08.S3(NOx) | NO2(GT) | PT08.S4(NO2) | PT08.S5(O3) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.6 | 1360.0 | 11.9 | 1046.0 | 166.0 | 1056.0 | 113.0 | 1692.0 | 1268.0 |
| 1 | 2.0 | 1292.0 | 9.4 | 955.0 | 103.0 | 1174.0 | 92.0 | 1559.0 | 972.0 |
| 2 | 2.2 | 1402.0 | 9.0 | 939.0 | 131.0 | 1140.0 | 114.0 | 1555.0 | 1074.0 |
| 3 | 2.2 | 1376.0 | 9.2 | 948.0 | 172.0 | 1092.0 | 122.0 | 1584.0 | 1203.0 |
| 4 | 1.6 | 1272.0 | 6.5 | 836.0 | 131.0 | 1205.0 | 116.0 | 1490.0 | 1110.0 |
| 5 | 1.2 | 1197.0 | 4.7 | 750.0 | 89.0 | 1337.0 | 96.0 | 1393.0 | 949.0 |
| 6 | 1.2 | 1185.0 | 3.6 | 690.0 | 62.0 | 1462.0 | 77.0 | 1333.0 | 733.0 |
| 7 | 1.0 | 1136.0 | 3.3 | 672.0 | 62.0 | 1453.0 | 76.0 | 1333.0 | 730.0 |
| 8 | 0.9 | 1094.0 | 2.3 | 609.0 | 45.0 | 1579.0 | 60.0 | 1276.0 | 620.0 |
| 9 | 0.6 | 1010.0 | 1.7 | 561.0 | 178.0 | 1705.0 | 109.0 | 1235.0 | 501.0 |
Last rows
| | CO(GT) | PT08.S1(CO) | C6H6(GT) | PT08.S2(NMHC) | NOx(GT) | PT08.S3(NOx) | NO2(GT) | PT08.S4(NO2) | PT08.S5(O3) |
|---|---|---|---|---|---|---|---|---|---|
| 8981 | 0.5 | 888.0 | 1.3 | 528.0 | 77.0 | 1077.0 | 53.0 | 987.0 | 578.0 |
| 8982 | 1.1 | 1031.0 | 4.4 | 730.0 | 182.0 | 760.0 | 93.0 | 1129.0 | 905.0 |
| 8983 | 4.0 | 1384.0 | 17.4 | 1221.0 | 594.0 | 470.0 | 155.0 | 1600.0 | 1457.0 |
| 8984 | 5.0 | 1446.0 | 22.4 | 1362.0 | 586.0 | 415.0 | 174.0 | 1777.0 | 1705.0 |
| 8985 | 3.9 | 1297.0 | 13.6 | 1102.0 | 523.0 | 507.0 | 187.0 | 1375.0 | 1583.0 |
| 8986 | 3.1 | 1314.0 | 13.5 | 1101.0 | 472.0 | 539.0 | 190.0 | 1374.0 | 1729.0 |
| 8987 | 2.4 | 1163.0 | 11.4 | 1027.0 | 353.0 | 604.0 | 179.0 | 1264.0 | 1269.0 |
| 8988 | 2.4 | 1142.0 | 12.4 | 1063.0 | 293.0 | 603.0 | 175.0 | 1241.0 | 1092.0 |
| 8989 | 2.1 | 1003.0 | 9.5 | 961.0 | 235.0 | 702.0 | 156.0 | 1041.0 | 770.0 |
| 8990 | 2.2 | 1071.0 | 11.9 | 1047.0 | 265.0 | 654.0 | 168.0 | 1129.0 | 816.0 |